In [1]:
%load_ext load_style
%load_style talk.css
This notebook only show some basic analysis on precipitation data based on previous notebooks
Precipitation data
Precipitation data is downloaded from GPCC Global Precipitation Climatology Centre (https://www.esrl.noaa.gov/psd/data/gridded/data.gpcc.html).
Here, let's see how to open netcdf data in python and generate yearly climatology of global precipitation. First, we will import the required libraries, then we will open the monthly average global precipitaton.
In [2]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt # to generate plots
from mpl_toolkits.basemap import Basemap # plot on map projections
import matplotlib.dates as mdates
import datetime
from netCDF4 import Dataset # http://unidata.github.io/netcdf4-python/
from netCDF4 import netcdftime
from netcdftime import utime
In [3]:
#!wget ftp://ftp.cdc.noaa.gov/Datasets/gpcc/full_v7/precip.mon.total.v7.nc
file = 'data\precip.mon.total.v7.nc'
variable = 'precip'
In [4]:
fh = Dataset(file, mode='r') # file handle, open in read only mode
lon = fh.variables['lon'][:]
lat = fh.variables['lat'][:]
nctime = fh.variables['time'][:]
t_unit = fh.variables['time'].units
pr = fh.variables[variable][:]
try :
t_cal = fh.variables['time'].calendar
except AttributeError : # Attribute doesn't exist
t_cal = u"gregorian" # or standard
fh.close() # close the file
If you want to learn more about what a specific function does, you can use the help function: help(function) or function?.
For example, help(np.mean) or netcdftime.utime?
In [5]:
#netcdftime.utime?
#netcdftime.utime.num2date?
utime = netcdftime.utime(t_unit, calendar = t_cal)
datevar= utime.num2date(nctime)
datevar[0:5]
Out[5]:
In [6]:
lat
Out[6]:
In [7]:
lat[0] # Caution! Python’s indexing starts with zero
Out[7]:
In [8]:
lat[-1] # gives the last value of the vector
Out[8]:
In [9]:
pr[0,:,:] # or simply pr[0]
Out[9]:
In [10]:
pr[0].shape # (360, 720)
Out[10]:
In [11]:
plt.imshow(pr[0]) #data for the first time-step (January 1901)
plt.title('Jan 1901 Monthly Rainfall [mm]')
Out[11]:
The matplotlib basemap is used here, which is a library for plotting 2D data on maps in Python. It is similar in functionality to the matlab mapping toolbox, the IDL mapping facilities, GrADS, or the Generic Mapping Tools. PyNGL and CDAT are other libraries that provide similar capabilities in Python (https://matplotlib.org/basemap/users/intro.html).
basemap supports many projecitons:
In [12]:
lons, lats = np.meshgrid(lon,lat)
m = Basemap(projection='kav7', lon_0=0)
m.drawmapboundary(fill_color='Gray', zorder=0)
m.drawparallels(np.arange(-90.,99.,30.), labels=[1,0,0,0])
m.drawmeridians(np.arange(-180.,180.,60.), labels=[0,0,0,1])
h = m.pcolormesh(lons,lats,pr[0], shading='flat',latlon=True)
m.colorbar(h, location='bottom', pad="15%", label='Rainfall [mm/month]')
plt.title('Jan 1901 Monthly Rainfall [mm]')
Out[12]:
In the previous section we noticed that our precipitation data-set only has values over land. This is where masked arrays come into play.
In [13]:
#pr # Python automatically created a masked array with the data and the mask
In [14]:
pr.data # gives you the data array only
Out[14]:
In [15]:
pr.mask # gives you True where the data is masked (over the ocean) and False where data is available (over land)
Out[15]:
In [16]:
pr.data[~pr.mask] # the tilde inverts the mask
pr.data
Out[16]:
In [17]:
clim = np.mean(pr, axis=0) # because pr is a masked array, all masked values are ignored in the calculation.
clim.shape
Out[17]:
'clim' can be plotted onto a map projection the same way you plotted 'pr[0]' above.
In [18]:
lons, lats = np.meshgrid(lon,lat)
m = Basemap(projection='kav7',lon_0=0)
m.drawmapboundary(fill_color='Gray', zorder=0)
m.drawparallels(np.arange(-90.,99.,30.), labels=[1,0,0,0])
m.drawmeridians(np.arange(-180.,180.,60.), labels=[0,0,0,1])
h = m.pcolormesh(lons,lats,clim, shading='flat',latlon=True)
m.colorbar(h,location='bottom',pad="15%", label='Rainfall [mm/month]')
plt.title('Yearly Rainfall Climatology')
Out[18]:
In [19]:
wgtmat = np.cos(np.tile(abs(lat[:,None])*np.pi/180, (1,len(lon)))) # Dimension: 72x144
In [20]:
mean_pr = np.zeros(pr.shape[0]) # Preallocation
for i in range(pr.shape[0]): # Don’t forget the ‘:’
mean_pr[i] = np.sum(pr[i] * wgtmat * ~pr.mask[i])/np.sum(wgtmat * ~pr.mask[i])
In [21]:
prunit = 'mm/month'
In [22]:
fig = plt.figure(figsize=(15, 6))
ax = fig.add_subplot(111)
plt.plot(range(1,pr.shape[0]+1), mean_pr, 'b-', label='precip',linewidth=2)
plt.axis([1,pr.shape[0]+1, np.min(mean_pr)-0.05, np.max(mean_pr)+0.05])
plt.xlabel('Month')
plt.ylabel('Precipitation Rate [' + prunit + ']')
plt.title('Monthly Mean Precipitation Rate Over Land (1901-2013)',fontsize=12)
plt.legend(loc=2, borderaxespad=1., frameon=False)
#plt.grid(True)
ax.autoscale_view()
plt.savefig('image/monthly_global_precip.png')
In [23]:
fig, ax = plt.subplots(1)
fig.set_figwidth(15)
fig.set_figheight(6)
# plot mean precipitation
ax.plot(datevar, mean_pr)
# rotate and align the tick labels so they look better
fig.autofmt_xdate()
# use a more precise date string for the x axis locations in the toolbar
ax.fmt_xdata = mdates.DateFormatter('%Y-%m-%d')
# add some docoration
plt.xlabel('Month')
plt.ylabel('Precipitation Rate [' + prunit + ']')
plt.title('Monthly Mean Precipitation Rate Over Land (1901-2013)',fontsize=12)
ax.autoscale_view()
In [24]:
np.savez('data/monthlylmeanpr.npz', prunit=prunit, dates=datevar, mean_pr=mean_pr)
Firstly, convert mean_pr data from [months] to [nyr|12|], where months=nyrx12 months/year. This can be done using the reshape function.
Secondly, calculate means of all months whithin a year, which can be done using np.mean along the axis=1.
In [25]:
ntimes = mean_pr.shape[0]
nyr = ntimes/12
nyr
Out[25]:
In [26]:
mean_pr_reshaped = mean_pr[0:ntimes].reshape((nyr,12))
mean_pr_yr = np.mean(mean_pr_reshaped, axis=1)
mean_pr_yr # print the results to the screen
Out[26]:
In [27]:
years = [idx.year for idx in datevar]
years = years[1::12]
print(years[-1])
years[0:13]
Out[27]:
In [28]:
fig, ax = plt.subplots(1)
fig.set_figwidth(15)
fig.set_figheight(6)
# plot mean precipitation
ax.plot(years, mean_pr_yr)
# add some docoration
plt.xlabel('Year')
plt.ylabel('Precipitation Rate [' + prunit + ']')
plt.title('Yearly Mean Precipitation Rate Over Land (1901-2013)',fontsize=12)
#plt.grid(True)
ax.autoscale_view()
we can save and load several arrays into a single file in uncompressed .npz format using np.savez.
The .npz file format is a zipped archive of files named after the variables they contain. The archive is not compressed and each file in the archive contains one variable in .npy format. For a description of the .npy format, see numpy.lib.format or the NumPy Enhancement Proposal http://docs.scipy.org/doc/numpy/neps/npy-format.html
When opening the saved .npz file with load a NpzFile object is returned. This is a dictionary-like object which can be queried for its list of arrays (with the .files attribute), and for the arrays themselves.
In [29]:
np.savez('data/annualmeanpr.npz', prunit=prunit, years=years, mean_pr_yr=mean_pr_yr)
In [30]:
npzfile = np.load('data/annualmeanpr.npz')
In [31]:
npzfile.files
Out[31]:
In [32]:
npzfile['years'][0:13]
Out[32]:
http://unidata.github.io/netcdf4-python/
John D. Hunter. Matplotlib: A 2D Graphics Environment, Computing in Science & Engineering, 9, 90-95 (2007), DOI:10.1109/MCSE.2007.55
Stéfan van der Walt, S. Chris Colbert and Gaël Varoquaux. The NumPy Array: A Structure for Efficient Numerical Computation, Computing in Science & Engineering, 13, 22-30 (2011), DOI:10.1109/MCSE.2011.37
Kalnay et al.,The NCEP/NCAR 40-year reanalysis project, Bull. Amer. Meteor. Soc., 77, 437-470, 1996.
In [ ]: